Contact Tracing, a core disease control measure employed by local and state health department personnel for decades, is a key strategy for preventing further spread of COVID-19. A relational database has been implemented to store information about a mobile application user who is tracked based on GPS which would aid to find the users who have come in contact with the infected patients. Surveys are conducted for users who can record their data about temperature, symptoms, etc. Aditionally information about labs and tests conducted in particular cities is maintained.
The following are the key assumptions that we made with respect to developing the contact tracing database/models:
1. We have modeled our application such that it can be used for contact tracing for the Covid-19 pandemic, but with minor modifications it can be used for generic contact tracing and for analytical use cases as well.
2. An app user is anyone who can use the contact tracing application and can be divided into two categories: a naïve user and a health monitor admin. An app user can have only a single phone number.
3. A naïve user is one who can use the application and is regularly surveyed for symptoms and tracked to check if they have come in contact with an infected user. A naïve user is either infected, not-infected or suspected based on the *infection status** which is given to him/her based on the results of tests conducted by a particular lab that are verified by the health monitor admin.
4. A health monitor admin can view information regarding naïve users and will be notified if a naïve user in a region that the admin is responsible for monitoring has a high risk of having contracted Covid-19.
5. A naïve user will also be notified if he/she is at risk of contracting Covid-19. The risk (can be calculated using triggers) of a naïve user having contracted Covid-19 depends upon various factors such as:
user symptoms and other information that get recorded as a part of regular surveys
medical condition of the user that compromises their immunity
travel history of user within a given time period
user coming in contact with zero or more infected user based on GPS tracking.
6. A naïve user is tracked based on GPS coordinates and we have considered “gpsLoc” as an attribute that uniquely identifies the coordinates the user on a particular date as we have considered the virus to be active/present where the infected person has been to for at least one day.
Conceptual model showing the various entities at a high level is built using Visual Paradigm.
Fig 3.1: Conceptual Model for Contact Tracing in Visual Paradigm
Logical diagram with Crow’s Foot Notation was built using Lucid Chart.
Fig 4.1: Logical Model for Contact Tracing in Lucid Chart
The logical schema shown above is resolved into the below relational schema. A relation R with attributes A1,A2,A3..An is shown as R(A1,A2,A3..An), where the primary key is underlined and foreign keys are shown in italics.
AppUser (userID, userPassword, firstName, lastName, phone)
HealthMonitorAdmin (adminID, regLat, regLong, qualification)
NaiveUser (naiveUserID, gender, immunoCompromised, DOB, lastSync, livingSituation, infectionStatus, zipcode)
Tracker (trackerID, gpsLoc, naiveUserID)
GPSlocation ( gpsLoc, gpsLat, gpsLong, date)
TravelHistory (country, naiveUserID)
Survey (surveyID, temperature, maskWorn, dateOfSurvey, naiveUserID)
Symptom (symptomID, symptomName)
SurveySymptom (symptomID, surveyID)
Test (testID, testType, testDate, result, naiveUserID, labID)
Lab (labID, labName, zipcode)
Address (zipcode, city, state)
Notification (notificationID, infectionRisk, userID)
A relational schema is considered to be in BCNF if for every one of its dependencies X → Y one of the following conditions holds true:
X → Y is a trivial functional dependency (i.e Y is a subset of X)
X is a superkey for the schema
AppUser
userID → { userPassword, firstName, lastName, phone }
phone → { userID, userPassword, firstName, lastName }
(Here, there are two candidate keys userID and phone which uniquely identify the non-key attributes and hence satisfies BCNF)
HealthMonitorAdmin
adminID → { regLat, regLong, qualification }
NaiveUser
naiveUserID → { gender, immunoCompromised, DOB, lastSync, livingSituation, infectionStatus, zipcode }
GPSlocation
gpsLoc → { gpsLat, gpsLong, date }
Tracker
trackerID → { gpsLoc, naiveUserID }
TravelHistory
country, naiveUserID form a composite primary key (trivial FD hence in BCNF)
Survey
surveyID → { temperature, maskWorn, dateOfSurvey, naiveUserID }
Symptom
symptomID → symptomName
SurveySymptom
surveyID, symptomID form a composite primary key (trivial FD hence in BCNF)
Test
testID → { testType, testDate, result, naiveUserID, labID }
Lab
labID → { labName, zipcode }
Address
Zipcode → { city, state }
Notification
notificationID → { infectionRisk, userID }
Tables created in MySQL database using CREATE statements
Fig 7.1: Schema creation in MySQL
CREATE Statements
CREATE DATABASE contactTracer;
USE contactTracer;
CREATE TABLE appUser(
userID INT PRIMARY KEY,
userPassword VARCHAR(30) NOT NULL,
firstName CHAR(30) NOT NULL,
lastName CHAR(30),
phone VARCHAR(13) NOT NULL UNIQUE
);
CREATE TABLE address(
zipcode INT PRIMARY KEY,
city CHAR(20) NOT NULL,
state CHAR(20) NOT NULL
);
CREATE TABLE healthMonitorAdmin(
adminID INT PRIMARY KEY,
regLat DECIMAL(10,8) signed NOT NULL,
regLong DECIMAL(11,8) signed NOT NULL,
qualification VARCHAR(30),
CONSTRAINT hm_admin FOREIGN KEY(adminID) REFERENCES appUser(userID)
);
CREATE TABLE naiveUser(
naiveUserID INT PRIMARY KEY,
gender ENUM(‘Male’, ‘Female’, ‘Other’),
dob DATE NOT NULL,
immunoCompromised BOOLEAN NOT NULL,
lastSync DATETIME,
livingSituation ENUM(‘Group’,‘isolated’),
infectionStatus ENUM(‘infected’,‘suspected’,‘not-infected’) NOT NULL,
zipcode INT NOT NULL,
CONSTRAINT naivU FOREIGN KEY(naiveUserID) REFERENCES appUser(userID),
CONSTRAINT zip FOREIGN KEY(zipcode) REFERENCES address(zipcode)
);
CREATE TABLE GPSlocation(
gpsLoc INT PRIMARY KEY,
gpsLat DECIMAL(10,8) signed NOT NULL,
gpsLong DECIMAL(11,8) signed NOT NULL,
date DATE NOT NULL
);
CREATE TABLE tracker(
trackerID INT PRIMARY KEY,
gpsLoc INT NOT NULL,
naiveUserID INT NOT NULL,
CONSTRAINT gps_track FOREIGN KEY(gpsLoc) REFERENCES GPSlocation(gpsLoc),
CONSTRAINT track_naive FOREIGN KEY(naiveUserID) REFERENCES naiveUser(naiveUserID)
);
CREATE TABLE travelHistory(
country VARCHAR(20),
naiveUserID INT,
CONSTRAINT cn_naive PRIMARY KEY (country, naiveUserID),
CONSTRAINT travel_naive FOREIGN KEY(naiveUserID) REFERENCES naiveUser(naiveUserID)
);
CREATE TABLE survey(
surveyID INT PRIMARY KEY,
temperature INT,
maskWorn BOOLEAN NOT NULL,
dateOfSurvey DATE NOT NULL,
naiveUserID INT NOT NULL,
CONSTRAINT survey_user FOREIGN KEY(naiveUserID) REFERENCES naiveUser(naiveUserID)
);
CREATE TABLE lab(
labID INT PRIMARY KEY,
labName VARCHAR(30),
zipcode INT NOT NULL,
CONSTRAINT zip_lab FOREIGN KEY(zipcode) REFERENCES address(zipcode)
);
CREATE TABLE test(
testID INT PRIMARY KEY,
testType ENUM(‘Antibody’,‘RT-PCR’),
testDate DATE NOT NULL,
result BOOLEAN NOT NULL,
naiveUserID INT NOT NULL,
labID INT NOT NULL,
CONSTRAINT test_lab FOREIGN KEY(labID) REFERENCES lab(labID),
CONSTRAINT test_user FOREIGN KEY(naiveUserID) REFERENCES naiveUser(naiveUserID)
);
CREATE TABLE symptom(
symptomID INT PRIMARY KEY,
symptomName ENUM(‘Fever’, ‘Dry Cough’, ‘Chest Pain’, ‘Shortness of Breath’, ‘tiredness’)
);
CREATE TABLE surveySymptom (
symptomID INT,
surveyID INT,
CONSTRAINT sy_sr PRIMARY KEY(symptomID,surveyID),
CONSTRAINT sym_sur FOREIGN KEY(symptomID) REFERENCES symptom(symptomID),
CONSTRAINT sur_sym FOREIGN KEY(surveyID) REFERENCES survey(surveyID)
);
CREATE TABLE notification(
notificationID INT PRIMARY KEY,
infectionRisk ENUM(‘High’,‘loW’),
userID INT NOT NULL,
CONSTRAINT notif_user FOREIGN KEY(NotificationID) REFERENCES appUser(userID)
);
Check to ensure that none of the constraints are violated.
Fig 8.1: Proof of constraint checks
Relevant dummy data inserted into the database using www.generatedata.com
Fig 9.1: Sample insert statements in MySQL
Fig 9.2: Result of Select * from table naiveUser
## [1] "address" "appuser" "gpslocation"
## [4] "healthmonitoradmin" "lab" "naiveuser"
## [7] "notification" "survey" "surveysymptom"
## [10] "symptom" "test" "tracker"
## [13] "travelhistory"
1.(Join of 3 tables) Names of users from cities Grand Island, Harrisburg and Fayetteville who have a travel history to any country
query1<- "SELECT appUser.firstName , appUser.lastName, address.city, travelHistory.country 'Country travelled to'
FROM appUser,NaiveUser,travelHistory, address
WHERE appUser.userID = naiveUser.naiveUserID AND
naiveUser.naiveUserID= travelHistory.naiveUserID AND
address.zipcode=naiveUser.zipcode AND
address.city IN ('Grand Island','Fayetteville','Harrisburg');"
## firstName lastName city Country travelled to
## 1 Allistair Conway Grand Island Japan
## 2 Maile Lott Grand Island Japan
## 3 Ira Leon Grand Island Uganda
## 4 Alden Carson Grand Island Sweden
## 5 Aiko Barrett Grand Island Kuwait
## 6 Allen Duke Grand Island Sweden
## 7 Upton Bradley Grand Island Botswana
## 8 Rashad Jackson Fayetteville Uganda
## 9 Chastity Bruce Fayetteville Andorra
## 10 Kessie Pena Fayetteville Sweden
## 11 Christen Bowen Harrisburg Guinea-Bissau
## 12 Wyatt Lancaster Harrisburg Sweden
## 13 Judah Sims Harrisburg Nauru
## 14 Keith Hess Harrisburg Australia
2.(Subquery) Find non-infected users and their living situation who have come in contact with a infected person (CONTACT TRACING) based on GPS location and date
query2<-"SELECT distinct naiveUser.naiveUserID, naiveUser.livingSituation
FROM naiveUser, tracker, GPSlocation
WHERE naiveUser.naiveUserID = tracker.naiveUserID
AND tracker.gpsLoc = GPSlocation.gpsLoc
AND naiveUser.infectionStatus != 'infected'
AND tracker.gpsLoc IN ( SELECT tracker.gpsLoc FROM naiveUser, tracker
WHERE naiveUser.naiveUserID = tracker.naiveUserID
AND naiveUser.infectionStatus = 'infected');"
## naiveUserID livingSituation
## 1 3 Group
## 2 4 isolated
## 3 5 isolated
## 4 7 isolated
## 5 10 Group
## 6 14 isolated
## 7 19 isolated
## 8 23 isolated
## 9 24 isolated
## 10 36 isolated
## 11 40 Group
## 12 46 isolated
## 13 52 Group
3.(Use having and Group By) List of all labs which have conducted more than 2 tests
query3<-"SELECT Count(test.testID) AS 'Test Count', lab.labName 'Lab Name'
FROM lab,test
WHERE test.labID = lab.labID
GROUP BY lab.labID
HAVING COUNT(test.testID)>1 ;"
## Test Count Lab Name
## 1 2 mus
## 2 2 lacus
## 3 2 nisl
## 4 2 mollis
## 5 2 at
## 6 2 cursus
## 7 2 etpoy
## 8 2 malesuada
4.(Complex query using SELECT CASE/WHEN)To determine risk status of a person
query4<-"SELECT naiveUserID, infectionStatus, immunoCompromised,
CASE WHEN infectionStatus = 'suspected' AND immunoCompromised = TRUE THEN 'This user is at high risk'
WHEN infectionStatus = 'suspected' AND immunoCompromised = FALSE THEN 'This user is at moderate risk'
WHEN infectionStatus = 'infected' THEN 'This user is Covid-19 positive'
WHEN infectionStatus = 'not-infected' AND immunoCompromised = TRUE THEN 'This user is at low risk'
ELSE 'This User is safe'
END AS Alert
FROM naiveUser;"
## naiveUserID infectionStatus immunoCompromised Alert
## 1 1 infected 1 This user is Covid-19 positive
## 2 2 suspected 1 This user is at high risk
## 3 3 not-infected 1 This user is at low risk
## 4 4 not-infected 1 This user is at low risk
## 5 5 not-infected 0 This User is safe
## 6 6 not-infected 1 This user is at low risk
## 7 7 suspected 1 This user is at high risk
## 8 8 not-infected 0 This User is safe
## 9 9 not-infected 1 This user is at low risk
## 10 10 not-infected 1 This user is at low risk
## 11 11 infected 1 This user is Covid-19 positive
## 12 12 not-infected 0 This User is safe
## 13 13 not-infected 0 This User is safe
## 14 14 suspected 1 This user is at high risk
## 15 15 infected 1 This user is Covid-19 positive
## 16 16 infected 0 This user is Covid-19 positive
## 17 17 not-infected 0 This User is safe
## 18 18 not-infected 0 This User is safe
## 19 19 not-infected 1 This user is at low risk
## 20 20 infected 0 This user is Covid-19 positive
## 21 21 not-infected 0 This User is safe
## 22 22 not-infected 1 This user is at low risk
## 23 23 not-infected 0 This User is safe
## 24 24 suspected 0 This user is at moderate risk
## 25 25 suspected 0 This user is at moderate risk
## 26 26 suspected 1 This user is at high risk
## 27 27 suspected 0 This user is at moderate risk
## 28 28 not-infected 0 This User is safe
## 29 29 infected 0 This user is Covid-19 positive
## 30 30 not-infected 0 This User is safe
## 31 31 not-infected 0 This User is safe
## 32 32 infected 1 This user is Covid-19 positive
## 33 33 not-infected 1 This user is at low risk
## 34 34 not-infected 1 This user is at low risk
## 35 35 suspected 1 This user is at high risk
## 36 36 not-infected 1 This user is at low risk
## 37 37 suspected 0 This user is at moderate risk
## 38 38 not-infected 0 This User is safe
## 39 39 not-infected 0 This User is safe
## 40 40 not-infected 1 This user is at low risk
## 41 41 not-infected 0 This User is safe
## 42 42 infected 0 This user is Covid-19 positive
## 43 43 infected 1 This user is Covid-19 positive
## 44 44 not-infected 1 This user is at low risk
## 45 45 infected 0 This user is Covid-19 positive
## 46 46 suspected 1 This user is at high risk
## 47 47 infected 1 This user is Covid-19 positive
## 48 48 infected 1 This user is Covid-19 positive
## 49 49 suspected 0 This user is at moderate risk
## 50 50 infected 1 This user is Covid-19 positive
## 51 51 infected 0 This user is Covid-19 positive
## 52 52 suspected 0 This user is at moderate risk
## 53 53 not-infected 1 This user is at low risk
## 54 54 suspected 0 This user is at moderate risk
## 55 55 not-infected 1 This user is at low risk
5.(Query of choice) GPS coordinates of infected users for Plotting- dummy data
query5<-"SELECT gpslocation.gpsLat AS Lattitude, gpslocation.gpsLong AS Longitude
FROM naiveUser, tracker, gpslocation
WHERE naiveUser.naiveUserID = tracker.naiveUserID AND
tracker.gpsLoc = gpslocation.gpsLoc
AND naiveUser.infectionStatus = 'infected';"
## Lattitude Longitude
## 1 47.40403 131.83584
## 2 22.38534 35.53054
## 3 52.10685 142.92803
## 4 86.66287 22.99641
## 5 47.40403 131.83584
## 6 1.88978 -21.19278
## 7 -86.95076 -131.54891
## 8 -56.85797 94.08396
## 9 36.54954 -157.57636
## 10 39.21384 -66.08290
## 11 -26.39023 139.71786
## 12 -85.96642 -127.08869
## 13 -87.20021 36.18430
## 14 64.46163 104.56520
## 15 -27.58789 68.46440
## 16 -31.69499 24.58768
## 17 -27.58789 68.46440
6.(optional) Select of number of infected cases per city
query7<-"SELECT count(naiveUser.naiveUserID) AS 'infected case count', address.city
FROM naiveUser, address
WHERE naiveUser.zipcode= address.zipcode AND
naiveUser.infectionStatus = 'infected'
GROUP BY(address.city);
"
## infected case count city
## 1 7 Flint
## 2 1 Grand Island
## 3 2 Helena
## 4 1 Kapolei
## 5 1 New Orleans
## 6 1 San Antonio
## 7 1 South Portland